FOREWORD


This report describes an assurance assessment of a representative
contemporary digital flight control system, stressing the use of various
methods in a complementary manner. The work was performed between
February 1, 1982, and September 1982, under contract number NAS2-11179.
The work was sponsored and directed by the Federal Aviation
Administration Technical Center, with the contract administered through
the National Aeronautics and Space Administration - Ames Research Center
under interagency agreement NAS NMI 1052.51 (Task Order DOT-FAA-77WAI-738).

1. INTRODUCTION AND SUMMARY


Under the FAA Technical Center's Digital Systems Program (182-3^0-100),
an integrated assurance assessment of a contemporary digital flight control
system was performed. The assurance methods of fault tree analysis, 
automated reliability prediction, failure mode and effect analysis, and 
fault insertion were applied in a complementary way to address the need for 
a workable approach to confirming the airworthiness of a critical digital 
system. The resulting assessment satisfied the requirements of Advisory 
Circular 25.1309-1 (Ref. 1), and is consistent with the validation 
requirements of RTCA Document DO-178 (Ref. 2). 

The digital system used in the analysis was the Redundant Digital
Flight Control System (RDFCS) procured jointly by the FAA and NASA-Ames
Research Center in 1979. The RDFCS facility is located at NASA-Ames as a 
central part of the Digital Flight Control Systems Verification Laboratory, 
a unique facility for research into the assurance issues of digital 
systems. Volume II of this report describes the RDFCS as it would be in a 
production configuration, including sensors and servos. The sensors and 
servos are not production-configuration equipment, and in fact, they are 
simulated in the RDFCS. 

The assessment consisted of the following major tasks: 

o Application of fault tree analysis, starting at the highest 
system functional level, proceeding to the hardware circuit card
level, and to the module level for the processors. 

o Development of a representative set of failure rates for the 
relevant hardware items. 

o Application of an automated reliability prediction program, 
CARSRA, to the system failure modes affecting airworthiness. 

o Application of failure mode and effect analysis to integrated 
circuit pin faults of three processor modules. 

o Definition of faults to be inserted in the RDFCS to determine the
effect of the fault when analysis was not feasible, and of other
faults to confirm the manual analysis. These faults were
subsequently inserted and the effects recorded.

Among the conclusions and observations resulting from this study are
that:

The integrated approach used here is capable, with diligent
application, of establishing the airworthiness of a Digital
Flight Control System (DFCS) within the context of AC 25.1309-1.
Specifically, this approach addresses those system aspects shown
in Table 1, including freedom from single-point failure modes and
system failure probability.

The integrated assurance approach used in this study should be
considered for use in validating other digital systems, including
DFCS, in compliance with AC 25.1309-1.

The quantitative assessment of system failure probability by two
methods (fault tree analysis and analytical reliability
prediction) offers increased assurance that the system meets the
quantitative requirements of AC 25.1309-1. For a flight-critical
system, this requirement is that the system failure probability
not exceed 1 x 10^-9 per hour of flight for each critical function
the system performs.

Fault insertion confirms that the fault detection capability and 
the fault tolerance capability described in the system documen- 
tation are actually implemented in the system. Since the fault 
tree analysis is based largely on the system response to faults 
as described in the system documentation, the fault insertion 
confirms that the fault tree analysis correctly reflects the 
behavior of the actual system in the presence of faults. 

The fault tree analysis generates software test requirements in 
terms of functions which the software must perform. These, 
in turn, provide a check of function criticality and of test 
requirements generated in accordance with RTCA Document DO-178.

Fault tree analysis proved unwieldy below the circuit card level, 
because at lower levels many more functions are being performed 
than there are hardware failure modes. Failure mode and effect 
analysis was accomplished successfully at the integrated circuit
pin level. 

As a training facility and a Reconfigurable Test Bed, the RDFCS 
facility has significant and valuable capabilities for 
investigating assurance issues of currently definable DFCS 
architectures. It also has potential enhanced capability in
certain areas, such as automated insertion of pin-level faults, 
for confirmation of analytically determined failure effects. 

The comparison of the time or cost required for the integrated
approach reported here with that required for other possible
assurance approaches was not specifically addressed in this
study. However, the time required for the integrated approach is
expected to compare favorably with that for other approaches,
assuming the same depth of analysis. The cost should also
compare favorably, provided a facility suitable for fault
insertion is available.

[Table 1 fragment: SYSTEM FAILURE PROBABILITY - RELIABILITY
PREDICTION, FAULT TREE ANALYSIS QUANTITATIVE EVALUATION]

2. OBJECTIVES AND SCOPE


OBJECTIVES


The primary objective of this contract was to explore and demonstrate
the integrated application of reliability, failure effects, and system
simulator methods in establishing the airworthiness of a flight-critical
digital flight control system. The emphasis was on the mutual
reinforcement of the methods, with results oriented toward inclusion in an
FAA Data Base.

SCOPE 


The scope of the effort was primarily limited to assessment of the
RDFCS in the automatic landing maneuver under Category IIIa conditions as
defined in AC 120-28C (Ref. 3). Application of methods below the system
level was on a selective basis and focused within the digital portions of
the system. Installation-dependent effects, such as failure of RDFCS
components induced by failure of components in other systems, were not
considered.

corresponding analytical description shall be prepared as necessary to
perform the integrated assessment. This description may include existing
documentation for the RDFCS, and as necessary, it shall include additional
components (e.g., secondary flight control) needed to reflect a realistic
DFCS.

FAULT TREE ANALYSIS

A fault tree analysis beginning at the system level is required. The
analysis shall be extended to the integrated circuit pin level for at least
three digital modules.

FAILURE RATES

A set of representative failure rates for the components and parts of
the RDFCS shall be developed as necessary to evaluate the fault tree for
failure probability.

FAULT SIMULATION CASES

A number of simulated fault conditions shall be defined for insertion
in the RDFCS simulator. These faults shall be for two purposes: to
confirm the assumptions underlying the fault tree analysis, and to resolve
uncertainty of the effect of the fault when analysis is not tractable.

FLIGHT CASE TRANSITIONS

A go-around flight case shall be installed on the RDFCS simulator, and
transition capability shall be installed to transition the airplane from
approach to landing and landing to go-around flight cases.

CARSRA RELIABILITY PROGRAM


The CARSRA reliability program shall be applied to the RDFCS. The
application shall be made in such a way as to be instructive for future
applications of CARSRA to other systems.

RDFCS

The RDFCS is described in considerable detail in Volume II of this
report. The description presented here summarizes the system architecture.
In most operational modes, the system is fail-passive, with a dual-channel
configuration. For automatic landings under Category IIIa conditions, the
system can be brought into a dual-dual fail-operational, fail-passive
configuration. The classification dual-dual relates primarily to the four
computer channels in the system. Each of the two flight control computers
(FCC) has two channels which run frame-synchronously, with each channel
driving one coil of a dual-coil servo in each axis. Any indication of
disagreement between the two channels in an FCC causes the servo connected
to that FCC to be disengaged by removing hydraulic pressure. Figure 1
summarizes the dual-dual configuration.

Monitoring Configuration and Implementations

Extensive monitoring is employed in the RDFCS for fault detection.
Coil current comparators for each servo provide coverage of faults
resulting in erroneous commands to the servo coils. They also provide
coverage for broken wire faults between the FCC and the servo or failures
of the coils themselves. These monitors, which are described in Volume II,
Sections 5.1.1.6.2 through 5.1.1.6.5, are made more effective by the
insertion of opposing 5 mA bias currents. The bias currents permit circuit
integrity to be monitored even when the FCC is not commanding the servo to
a new position, such as when the aircraft is flying through very calm air
at a stable attitude. It may be noted that this type of monitoring is
equally applicable to analog and digital systems.

Response of the autopilot servos to commands from the servo amplifiers
is monitored by modulator piston position signals fed back to the FCC (Vol.
II, Sections 5.1.1.6.3 through 5.1.1.6.5). The feedback signals are
averaged and passed through a high-pass filter to get a modulator rate that
is compared with coil current. This comparison is used to detect jamming
of the modulator piston, runaway conditions, or loss of hydraulic power.
This type of monitoring also can be applied to either analog or digital
systems.

In the pitch-axis servos, modulator piston position monitoring is
implemented in hardware. In the other two axes, it is implemented in
software. Together, the coil current monitoring and modulator piston
monitoring detect any servo fault which prevents the servo from responding
to commands. They also detect any fault in a computer channel which
prevents that channel from generating a reasonable command for the servos
in each of the three axes. All monitors and feedback sensors are dual to
increase reliability.

Each computer channel has an iteration monitor implemented in hardware
(Vol. II, Figures 5.1.2.1.2 through 5.1.2.1.3). This monitor observes the
state of a discrete software variable which is changed at the end of each
iteration of the foreground software. Since this software executes at a 20
Hz rate, the result is a 10 Hz square wave. Should the processor
short-loop or hang up, the 10 Hz wave will not be presented and the
iteration monitor will withdraw its input to the engage logic and the FCC
will disengage.

Sensor monitoring is primarily accomplished by comparison and by
validity discretes generated by the sensors (Vol. II, Sec. 5.1.2.4 through
5.1.2.8). There is no one place that sensor monitoring takes place, since
all four computer channels incorporate the monitoring function. This
ensures that the circuitry involved in getting the sensor signals to each
channel is included in the monitoring.

The gyro and accelerometer discretes are generated as described in
Volume II, Sections 5.11 through 5.12. The accelerometers are tested as
described in Section 5.11 each time the system is powered up with the
airplane on the ground.

The ILS receivers are checked using the square wave test of Volume II,
Section 5.1.2.3.1.1.5. This test checks for failure of the localizer and
glideslope beam deviation inputs. During landing, the outputs of both
receivers are compared, with reliance on the self-monitoring to identify
which receiver is bad if the signals disagree. The comparison monitoring
is used to check wire integrity between the receiver and the computer
channels. The other dual sensors are comparison monitored in the same way.

Even though each channel monitors sensors individually, any channel
can initiate the NO DUAL annunciation, which is the primary indication that
the system is not fail-operational. If any channel detects a second
failure of a sensor type, it will cause its FCC to disengage, but the other
FCC will remain engaged.
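The per-channel response to sensor failures described above can be
summarized in a few lines. This is an illustrative sketch only: the
function name, the dictionary layout, and the sensor type names are
hypothetical, and the logic captures just the two rules stated in the
text (first failure of a type gives NO DUAL; second failure of the same
type disengages the channel's own FCC).

```python
def channel_response(failed_count_by_type):
    """Map failures seen by one channel to its annunciation and engage action.

    failed_count_by_type: dict of sensor type -> number of failed sensors
    of that type (0, 1, or 2 for a dual sensor set) detected by this channel.
    """
    counts = failed_count_by_type.values()
    return {
        # Loss of one sensor of any type removes fail-operational status.
        "NO_DUAL": any(n >= 1 for n in counts),
        # A second failure of the same type disengages this channel's FCC.
        "FCC_DISENGAGE": any(n >= 2 for n in counts),
    }


no_faults = channel_response({"radio_altimeter": 0, "ils_receiver": 0})
one_fault = channel_response({"radio_altimeter": 1, "ils_receiver": 0})
two_faults = channel_response({"radio_altimeter": 2, "ils_receiver": 0})
```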

Although NO DUAL is the primary warning of loss of one sensor, NO
ALIGN will be annunciated if the course signals from the two compass
systems do not agree.

Other monitoring within the FCC involves comparison of active
operating modes. If the two channels within an FCC disagree on which modes
are engaged, and the disagreement lasts for more than 0.1 sec, the FCC will
disengage. If the two FCC's disagree, SPLIT will be displayed on the
Warning Annunciator Indicators. This monitoring, together with the sensor
data transfers, will detect most faults of the cross-channel data transfer
circuitry.
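The persistence requirement on mode disagreement can be sketched as a
simple counter. The 0.1 sec threshold is taken from the text; the 0.05 sec
sample period (one comparison per 20 Hz foreground iteration), the mode
name "ALT", and the function name are assumptions made for illustration.

```python
PERSISTENCE_S = 0.1  # disagreement must last longer than this (from the text)


def fcc_disengages(modes_a, modes_b, sample_period_s):
    """Return True if the two channels disagree for more than PERSISTENCE_S.

    modes_a, modes_b: per-sample sets of engaged modes for channels A and B.
    """
    run = 0  # consecutive disagreeing samples
    for a, b in zip(modes_a, modes_b):
        run = run + 1 if a != b else 0
        if run * sample_period_s > PERSISTENCE_S:
            return True
    return False


channel_a = [{"A/L"}] * 5
# Two-sample glitch (0.1 sec): does not exceed the threshold.
glitch_b = [{"A/L"}, {"ALT"}, {"ALT"}, {"A/L"}, {"A/L"}]
# Three-sample disagreement (0.15 sec): exceeds the threshold.
fault_b = [{"A/L"}, {"ALT"}, {"ALT"}, {"ALT"}, {"A/L"}]
```

The reset of the counter on any agreeing sample is what keeps a brief
transient from disengaging the FCC.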

SIMULATOR DESCRIPTION 

The RDFCS simulator consists primarily of the RDFCS pallet, shown
in Figure 2, and a PDP 11/60 computer. The RDFCS pallet includes the
Flight Control Computers (FCC), core memory, Modular Digital Interface 
Control Unit (MDICU), Servo Simulator Panel (SSP), Discrete Switch Panel 
(DSP), CAPS Test Adapters (CTA), and Computer Breakout Panels. The 
functions of these items are described in the remainder of this section. 


Computer/Airplane Model


The PDP 11/60 computer hosts a discrete-state model of the airplane in
which the RDFCS is installed. This airplane is a representative wide-body
transport, and the model coefficients are changed according to the flight
case being simulated. Each flight case, then, is a point simulation of the
airplane in a particular configuration and operating in a specific portion
of the flight envelope. The airplane model executes at a 50 Hz rate.
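The idea of a point simulation with case-dependent coefficients can be
sketched as follows, assuming the discrete-state model takes the linear
form x[k+1] = A x[k] + B u[k]. That form, the two-state layout, and every
coefficient value below are hypothetical placeholders; only the 50 Hz
update rate comes from the text, and the actual flight cases are those
described in Reference 4.

```python
DT = 1.0 / 50.0  # model update period in seconds (50 Hz, from the text)

# Hypothetical coefficient sets, one per flight case (a "point" in the envelope).
FLIGHT_CASES = {
    "approach":  {"A": [[1.0, DT], [-0.02, 0.98]], "B": [0.0, 0.010]},
    "go_around": {"A": [[1.0, DT], [-0.03, 0.97]], "B": [0.0, 0.015]},
}


def step(case, x, u):
    """Advance the state vector x by one 50 Hz frame for the given flight case."""
    a_matrix = FLIGHT_CASES[case]["A"]
    b_vector = FLIGHT_CASES[case]["B"]
    return [
        sum(a * xi for a, xi in zip(row, x)) + b * u
        for row, b in zip(a_matrix, b_vector)
    ]


# A transition between cases amounts to selecting a different coefficient set
# (and, in the real simulator, establishing new trim values).
x_approach = step("approach", [1.0, 0.0], 0.0)
x_go_around = step("go_around", [1.0, 0.0], 0.0)
```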

As part of this study, a go-around case was added to the library of 
cases available. These cases are described and discussed in Reference 4. 
The go-around case is characterized as follows: 





Airplane Weight        314,500 lb
Altitude               35 ft
Angle of Attack        10.91°
Indicated Airspeed     168 kts
Flap Deployment        22°
Center of Gravity      25% of c


Transition capability was added to go from approach conditions to
landing conditions, and from landing to the new go-around case. The
transitions involve changing the model coefficients and establishing new
trim values. The transition capability has been installed and checked out
successfully.

Modular Digital Interface Control Unit 


The Modular Digital Interface Control Unit (MDICU) receives the output
of the airplane discrete-state model through a communication link with the
PDP 11/60 computer. The MDICU converts the various pieces of information
into the form needed by the FCCs. For example, roll angle and pitch angle
are converted to three-wire AC signals, properly scaled, while localizer
deviation is coded in ARINC serial digital format. The MDICU is described
more fully in Reference 5.

The MDICU incorporates provisions for the signal for the No. 1 sensor 
of each type to be ramped up or down. This facility is accessed by means 
of the HP 2645A terminal physically located in the pallet. 



Computer Breakout Panels


Each sensor signal going from the MDICU to the FCC's can be
interrupted at the Computer Breakout Panels by removing the appropriate
jumper plug. Every FCC back connector pin is routed through one of these
plugs. The lower portion of Figure 3 shows the rows of plugs for connector
P1 and the "A" half of connector P2. Each FCC has its own breakout panel.

CAPS Test Adapters

Figure 3 also shows the CAPS Test Adapter (CTA) for one of the FCC's.
The upper half of the CTA includes, on the right-hand side, four address
and four data windows. An address can be loaded in each address window,
and the corresponding data window used to display the data on the FCC
A-side processor bus data lines every time the address appears on the
address lines. The CTA also has other capabilities, such as providing a
history of the last 16 bus transfers and changing the contents of a
specific memory location within the FCC, but during the study only the
address monitoring was used. Discrete variables representing sensor voter
status were monitored visually via the data windows. Continuous variables,
such as inputs to the servo amplifiers, were monitored by using the analog
output posts below the appropriate data window to drive a strip-chart
recorder.

The lower half of the CTA performs the same functions as the
upper half, but for the B side of the FCC.
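The address-window behavior described above can be mimicked in software
when replaying a recorded bus trace. This is a hypothetical sketch, not a
description of the CTA circuitry: the class and function names and the
example addresses are invented for illustration, while the 16-transfer
history depth is taken from the text.

```python
from collections import deque


class AddressWindow:
    """Latch the bus data each time the watched address appears."""

    def __init__(self, address):
        self.address = address
        self.data = None  # nothing displayed until the address is seen

    def observe(self, bus_address, bus_data):
        if bus_address == self.address:
            self.data = bus_data


def replay(transfers, windows):
    """Feed (address, data) transfers to the windows; keep the last 16."""
    history = deque(maxlen=16)
    for address, data in transfers:
        history.append((address, data))
        for window in windows:
            window.observe(address, data)
    return history


voter_status = AddressWindow(0x0100)  # hypothetical address of a voter discrete
trace = [(0x0100, 1), (0x0200, 7), (0x0100, 3)]
recent = replay(trace, [voter_status])
```

After the replay, the window holds the most recent value written while
its address was on the bus, which is how a discrete such as sensor voter
status would be read off the display.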

Servo Simulator Panel


The servo amplifier outputs from the FCC's are routed to the Servo
Simulator Panel (SSP), shown in Figure 4. The SSP simulates the dynamics
of the autopilot and power servos, and generates the required feedback
signals such as modulator piston position. The SSP has circuits which can
simulate a hardover or slowover command to a servo coil. It can also
simulate a hardover or slowover of a modulator piston, including the
modulator piston position feedback signal and the command to the power
servo. All of these apply to the No. 1 servo of each type.

Discrete Switch Panel 


The Discrete Switch Panel (DSP), Figure 5, is located just below the
SSP. This panel provides a centralized location for switches such as
hydraulic pressure switches and autopilot disconnect switches. The panel
also includes switches that can be used to insert sensor validity faults.
These faults can also be inserted by pulling the appropriate jumper plug on
the FCC Breakout Panel.

Core Memory


The pallet also contains core memory for the FCC's. This is used for
both data and program memory to provide flexibility and convenience in
using the pallet to simulate other airplanes or DFCS architectures. As
used in an airplane, the FCC's have the flight software stored in
programmable read-only memory (PROM) and use random access memory (RAM)
chips for data memory.

Glare-Shield Panel 


The pallet also has a glare-shield panel, which is the control panel
for the system as installed in an airplane. It includes the engage (bat
handle) switches, mode select switches, altitude select knob, and other
controls. The pallet also has a single ADI, HSI, radio altitude display,
Mode Indicator, and Warning Annunciator Indicator.

FAULT TREE ROLE IN INTEGRATED ASSURANCE

The integrated assurance assessment of the RDFCS begins with a fault
tree analysis of the system function. Referring back to Table 1, the fault
tree analysis has several functions. The first function is to ensure that
no system component has any failure mode which can result in system
failure. Most of the components, such as the sensors and servos, have only
a few failure modes which can be observed at the interfaces with the rest
of the system. For these components, the fault tree analysis provides
assurance that no failure modes can cause system failure. The assurance is
obtained by reviewing the completed tree and determining that system
failure can only occur as a result of multiple failures.
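The review for single-point failures amounts to checking that every
minimal cut set of the completed tree contains at least two basic events.
The sketch below shows that check on a toy tree. It is an illustration
under stated assumptions, not the procedure used in the study: the tree
encoding, the event names, and the function name are hypothetical.

```python
from itertools import product


def minimal_cut_sets(node):
    """Return the minimal cut sets of a fault tree as a set of frozensets.

    node is either a basic-event name (str) or a tuple (gate, children),
    where gate is "AND" or "OR" and children is a list of nodes.
    """
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [minimal_cut_sets(child) for child in children]
    if gate == "OR":
        candidates = set().union(*child_sets)
    elif gate == "AND":
        candidates = {frozenset().union(*combo) for combo in product(*child_sets)}
    else:
        raise ValueError("unknown gate: %r" % (gate,))
    # Discard any cut set that strictly contains another one.
    return {s for s in candidates if not any(other < s for other in candidates)}


# Toy dual-redundant tree: system failure needs both items of some pair to fail.
TREE = ("OR", [
    ("AND", ["sensor_1_lost", "sensor_2_lost"]),
    ("AND", ["fcc_1_fails", "fcc_2_fails"]),
    ("AND", ["servo_1_fails", "servo_2_fails"]),
])

single_points = sorted(e for s in minimal_cut_sets(TREE) if len(s) == 1 for e in s)
```

An empty `single_points` list corresponds to the conclusion that system
failure can occur only as a result of multiple failures.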

In general, digital modules (and therefore digital components) can 
have a substantial number of different failure modes. In such cases, it
becomes quite laborious to continue the fault tree development to a level 
of detail sufficient to confirm that none of those failure modes can cause 
system failure. The second function of fault tree analysis is to identify 
which digital modules are involved in performing critical functions. The 
task of assuring that no single module level failure can cause system 
failure is performed with failure mode and effect analysis (FMEA) . 

A major benefit of fault tree analysis is that it focuses on the
functions performed by the system elements, including those system elements
involved in detecting faults and providing appropriate annunciation to the
flight crew. Consequently, the third function of fault tree analysis is to
confirm the adequacy of monitoring (i.e., fault detection and annunciation)
in the system.

Fault tree analysis is also used to identify specific software
functions required for system operation, including fault monitoring
implemented in software. The software test requirements for these
functions are then specifically reviewed to confirm that these requirements
are adequate. This fourth function of fault trees is discussed more fully
and illustrated subsequently as the tree for the RDFCS is developed.

The fifth function of fault tree analysis is to provide an alternate
means of computing the probability of system failure. This provides a
check of the probability obtained from the CARSRA program to ensure that
the CARSRA input does not have errors which would produce a false low
probability of system failure.

FAULT TREE DEVELOPMENT

The fault tree analysis is based on the undesired event that the
airplane has an unacceptable deviation from the desired flight profile
during the last 150 feet of descent while executing an automatic landing,
as shown in Figure 6. This portion of flight, which is the only flight
phase during which the RDFCS performs a critical function, is termed the
"crucial flight phase" in this report. Category IIIa conditions are
assumed, so that the human pilot cannot complete the landing using visual
cues should the RDFCS fail.

The analysis begins with the RDFCS in the dual-dual configuration. It 
should be noted that this configuration is available only after the 
Instrument Landing System (ILS) push-button has been used to select the 
Approach/Land (A/L) mode (Ref. Vol. II, Section A. 3.6.1). After this 
switch has been momentarily depressed, the A/L mode is transmitted to the 
FCC's and latched in. The switch is no longer needed, and therefore does
not enter into the analysis. 

The top event of Figure 6 can be caused by any of three conditions, or
subevents. For convenience, these can be referred to as Level-2 events,
with the top event considered to be at Level 1. The Level-2 events are
shown as the middle row in Figure 6. The first of these is that the system
design is in some manner deficient for the environmental conditions
encountered. This includes the possibility that the conditions encountered
are outside of the system design requirements; it also includes the
possibility that the control laws are deficient for some conditions which
may be expected. This possibility is outside the scope of this project and
is not pursued here. References 6 and 7 address this subject. In
particular, Section 3.3.1.3 of Reference 6 discusses establishing an upper
bound on the probability of a deficient control law by statistical methods.

The second of the Level-2 events occurs if the airplane enters the
crucial phase with the RDFCS not fail-operational, and then a component
failure occurs which prevents the system from completing the landing.

The third of the Level-2 events is that the crucial phase is entered
with a fail-operational RDFCS, but multiple component failures occur before
the end of the phase, and these failures result in RDFCS system failure.

The second of the Level-2 events, that the crucial phase is initiated
without fail-operational capability, is expanded into three relevant
functional areas, or Level-3 events: sensing aircraft attitude and
position, computation of required outputs, and servo response to computed
commands. The first of these, the sensing function, is expanded in Figure
7 into the various parameters needed by the FCC's in the automatic landing
control laws. At this and higher levels, the fault tree is functionally
oriented: failures are in terms of loss of function rather than loss of
hardware.

The fault tree stub of Figure 8 extends the sensing function for
normal acceleration to the individual hardware elements used to measure the
acceleration and transmit it to the computers. The failure of the normal
acceleration signal No. 1 to be present in all computer channels can be
caused by loss of the sensor itself, associated wiring, or one of the
circuit cards involved in receiving the signal and transmitting it to all
channels. Volume II, Figure 5.1.1.3.1 shows the functional flow of these
cards. The A2A Autoland Sensor Input and A27 Discrete Input Cards are both
involved: the A2A card handles the analog acceleration signal and the A27
card handles the validity discrete signal. The processor itself is not
involved in the data acquisition process and so is not shown. At this
level, the transition has been made from required functions to the hardware
which performs those functions.

Failure of the system to provide a NO DUAL annunciation is shown in
Figure 9. This figure is of particular interest because of the explicit
software function identified. A failure rate of zero is assigned to
failure of this function, because it can be explicitly and exhaustively
tested. Once it has been so tested, the probability of both NO DUAL
annunciations failing because of a generic software error is taken to be
zero. A generic software error is a discrepancy in the software which will
cause all computer channels which use that software to produce the same,
but wrong, result. Multiple computer channels do not provide redundancy
with respect to generic software errors as long as the same software is
used in all channels, as it is in most contemporary systems, including the
RDFCS. Reference 7 may be consulted for a discussion of software errors,
and RTCA Document DO-178 should be consulted for a discussion of software
test requirements.

Fault tree stubs similar to that shown in Figure 8 were developed for
the other sensors of Figure 7. These are very much like the stub shown in
Figure 8 and so are not included in the report.

The second of the Level-3 events of Figure 6 is that the crucial
flight phase is initiated without fail-operational computing capability and
that an additional component failure causes system failure before the phase
is complete. This is shown in Figure 10 as four Level-4 events. The first
of these, that channel A of FCC No. 1 fails above alert height, can be
caused by either channel of the FCC failing to produce a required output,
as shown by the eight events at the lowest level (Level-5) in Figure 10.

Figure 11 continues the development of the fault tree for one of the
Level-5 events of Figure 10. This event, failure of the A channel of FCC
No. 1 to produce a rudder command, can be caused by failure of any one of
several cards within the channel. In this study, the two cards which make
up the processor were considered in more depth than the others. These two,
the A13 Control Card and the A14 Data Path Card, are shown in Figures 12
and 13, respectively, in terms of the modules described in Section 5.1.1.1,
Volume II. Also shown in each of Figures 12 and 13 is a subevent for
failure of a miscellaneous part, such as the circuit board, the edge
connector, or other part which is not included in one of the modules named
in the other blocks.

Theoretically, the fault tree analysis of the failure of the processor
to compute the rudder command can be continued below the module level to
the individual integrated circuit pins or discrete piece-parts. The
desirability of doing this is questionable, however, because of the nature
of the processor. The processor is not designed to perform a single
specific function, such as computing rudder commands. It is designed to
efficiently perform a number of simple functions, such as addition,
multiplication, and logic operations. A suitable sequence of such
operations (i.e., the flight software) is used to make the processor
generate the rudder command, the aileron command, and so forth. It is much
easier to relate the modules and integrated circuits (IC) to the simple
functions (add, multiply, etc.) than to the much more complicated functions
of computing the command for a particular servo.

It is also easier, in general, to relate a specific failure mode of an
integrated circuit within the processor to its effect on the processor
operation than to start with the effect and then work in the other
direction to the IC failure modes which would produce the effect. In other
words, it is easier to do an FMEA than a fault tree analysis at this level.

Another reason for preferring FMEA to fault trees at this level is 
that in the course of performing the fault tree analysis, the analyst must 
account for all of the ways the processor can fail; that is, all of the 
ways in which the processor output can be wrong. 

These ways are the failure modes of the processor. Each of these
modes must then be traced to all possible combinations of IC pin failures
which could produce the processor failure mode. Because processors have
many different possible outputs, there are a large number of ways that the
output could be wrong. There is no practical way of assuring that all of
these possibilities have actually been covered in the fault tree. The FMEA
requires that all pin-level IC failure modes be considered. These modes
are much better understood, and there are fewer of them, so that it is much
easier to be certain that they have all been covered. This is not meant to
imply that a complete pin-level FMEA is easy or inexpensive; it is neither.
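The completeness argument for the pin-level FMEA rests on the fact that
its starting list can be enumerated mechanically: one entry per
integrated circuit, pin, and pin fault mode. The sketch below illustrates
that enumeration. The three fault modes, the IC designators, and the pin
counts are assumptions for illustration and are not taken from the RDFCS
parts list.

```python
# Assumed pin-level fault modes; the report does not list the set it used.
PIN_FAULT_MODES = ("stuck_high", "stuck_low", "open")


def fmea_worksheet(pin_counts):
    """Return one (ic, pin, mode) row for every pin fault to be analyzed.

    pin_counts: dict of IC designator -> number of pins on that IC.
    """
    return [
        (ic, pin, mode)
        for ic, count in pin_counts.items()
        for pin in range(1, count + 1)
        for mode in PIN_FAULT_MODES
    ]


# Hypothetical module with one 14-pin and one 16-pin device.
rows = fmea_worksheet({"U1": 14, "U2": 16})
```

The worksheet size is simply the total pin count times the number of
fault modes, so coverage can be verified by counting rows, which is not
possible for the open-ended list of processor output errors.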

In light of the foregoing considerations, the fault tree analysis of 
the processor was not continued below the level developed in Figures 12 and 
13. Instead, the FMEA approach was used as described in Section 6. 

To continue with the development of other branches of the fault tree,
Figure 14 develops the event of Figure 11 that the pilot is not warned that
FCC No. 1 A channel is not generating a correct rudder command. This
portion of the fault tree includes several software functions. In a
production program, the test requirements of each of these functions should
be reviewed to confirm that they satisfy the criteria of RTCA Document
DO-178 (Reference 2). In this project, conducted for illustrative
purposes, this review was not made.

Tree stubs similar to those developed in Figures 11 through 14 were
developed for the other required outputs from Channel A of FCC No. 1 and
the other three channels (Figure 9). They are not included here because
they are quite repetitive of the analysis shown.

The last of the Level-3 events of Figure 6 is that the crucial phase 
is initiated without fail-operational servo capability and a debilitating 
failure occurs. This is expanded in Figure 15 into the three aircraft
control axes: roll, pitch, and yaw. Figure 16 shows the fault tree for 
failure of the No. 1 yaw autopilot servo, with the servo failure not 
annunciated to the crew. 

Fault tree stubs for the other 5 servos of Figure 15 were developed to 
complete the analysis of the Level-3 events of Figure 6. These are quite 
similar to the stub shown for the rudder servo and are not included in the 
report. This completes the discussion of the second of the Level-2 events 
of Figure 6. 

The third of the Level-2 events of Figure 6 is that multiple failures 
occur during the crucial flight phase and these occur in a combination 
which causes system failure. Figure 17 shows the initial development of
this event to lower levels. Continuing this development produces a major
branch of the fault tree quite similar to, but simpler than, that for the
second of the Level-2 events. It differs primarily in that the NO DUAL
annunciation does not appear, since that particular warning is suppressed
during the crucial phase. Since that major branch is so similar to that
already discussed, it is not described further here.

QUANTITATIVE FAULT TREE ANALYSIS


System failure probability was computed from the fault tree using the 
hardware failure rates presented in Section 8. A failure rate of zero was 
used for each software function, since there is currently no acceptable way
of predicting DFCS software failure rates (Reference 2, Section 2.2.1). 

Considering hardware failure modes only, the probability of initiating
the crucial phase with less than fail-operational capability and a second
failure debilitating the system was calculated to be 2.46 x 10^[illegible].
This is based on a flight time of 4.0 hours prior to the crucial phase and a
crucial phase duration of 0.02 hours.

The probability of the system failing because of multiple failures
during the crucial phase was calculated to be 0.638 x 10^-9. This is based
on a crucial phase duration of 0.02 hours.

The system failure probabilities computed are actually upper bounds on 
the actual failure probabilities. This is because the fault trees are 
based on the assumption, for many items, that all failure modes of the item 
render the item incapable of performing any of its functions. For example,
certain buffers on the A26 Data Acquisition Card are used for sensor data
which is not required for automatic landing, and so at least some of the
failures of these buffers would not prevent the card from correctly 
handling required data. However, the failure rates used in the analysis 
are for the entire card, including these buffers, so that the failure 
probability calculated for the card includes card failure modes which would 
not affect automatic landing. 
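
The arithmetic behind a two-event branch of this kind can be sketched as
follows. This is a minimal illustration, assuming independent events with
constant (exponential) failure rates. The function names and the two failure
rates are hypothetical and are not taken from the fault tree of this study;
only the exposure times (4.0 hours before the crucial phase and 0.02 hours
during it) come from the text.

```python
import math


def failure_probability(rate_per_million_hours, exposure_hours):
    """Probability of at least one failure during the exposure time,
    assuming a constant failure rate (exponential model)."""
    rate_per_hour = rate_per_million_hours * 1e-6
    return -math.expm1(-rate_per_hour * exposure_hours)


def and_gate(*probabilities):
    """AND gate: all independent input events occur."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def or_gate(*probabilities):
    """OR gate: at least one independent input event occurs."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


# Hypothetical rates (failures per million hours), for illustration only.
first_failure = failure_probability(10.0, 4.0)    # before the crucial phase
second_failure = failure_probability(20.0, 0.02)  # during the crucial phase
branch = and_gate(first_failure, second_failure)
print(f"{branch:.3e}")
```

Because every failure mode of an item is counted against it, a calculation of
this form inherits the upper-bound character described above.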


TABLE 2. QUANTITATIVE RESULTS


Probability Of              Fault Tree Result        CARSRA Result

Unannunciated Failure       2.46 x 10^[illegible]    3.36 x 10^[illegible]
in Cruise and Second
Failure in Landing

Multiple Failures           0.64 x 10^-9             0.66 x 10^[illegible]
in Landing

6. FAILURE MODE AND EFFECT ANALYSIS


ROLE IN INTEGRATED ASSURANCE

As stated in Section 5, fault tree analysis provides assurance that
most system components, such as analog sensors and servos, have no single
failure mode which produces system failure. This is because such
components have only a few possible failure modes, and it frequently is not
necessary to distinguish in the fault tree among these modes. When it is
necessary to distinguish among modes, it is usually fairly simple to
identify the modes which are relevant in the branch of the tree being
developed. The analysis can often be extended below the component level to
the failure modes of the individual piece-parts which comprise the
component. Analysis to this very detailed level is sometimes necessary to
ascertain that a component has no failure modes which could remain
undetected until a second failure occurs elsewhere in the system.

Fault tree analysis is cumbersome and inefficient, however, if extended
from system level to the integrated circuit pin level in the processor of a
digital system. This is a result of two basic
characteristics of digital systems:

1. Functions which are described very simply at a higher level 
(e.g., sensor monitoring) require a myriad of sequential 
operations at the integrated circuit level. These operations are 
required to obtain the proper data, route it to the proper
registers within the arithmetic logic unit (ALU) where arithmetic
and logic operations are actually performed, and route the
results to the proper storage register or output port. Many
different integrated circuits are involved in each of these
operations.

2. Many interfaces between integrated circuits involve several
pins, and it is the combination of pin states (electrically high
or low) which is significant. That is, each combination of pin 
states represents a different data value or instruction, and the 
effect of a single pin being in the wrong (faulted) state depends 
on the state of the other (non-faulted) pins. 

The net result of these characteristics of digital hardware is that
there are many more integrated-circuit-level operations performed in
executing the flight software than there are pin-level failure modes. In
extending a fault tree analysis from failure of system-level functions to
failure of integrated circuit pins, all of these detailed operations must
be included and accounted for, an extremely inefficient process. Once the
fault tree had been fully developed, another extremely laborious task would
remain: reviewing the tree to make certain (1) that all of the failure
modes of the integrated circuits had been accounted for, and that no
failure mode could remain undetected until a second failure occurred, with
the combined effect of both faults producing a hazardous condition; and (2)
that no failure mode could by itself produce a hazardous condition.

Failure mode and effect analysis provides a means of systematically
examining all of the potential failure modes of the integrated circuits to
confirm that none of them could cause a hazard directly or remain latent
and subsequently cause a hazard in conjunction with a second failure.

GENERAL CONSIDERATIONS 


In conducting the pin-level failure mode and effect analysis of a
processor, three factors greatly reduce the effort. The first factor is
that propagation of most faults under all conditions does not have to be
considered. A single effect can usually be found which will totally
debilitate the processor. For example, a faulted processor output pin will
result in the processor trying to read about half of the data and machine
level instructions from the wrong memory addresses. This will result in
the coil current comparators tripping, sensor comparisons failing, and in
the case of the RDFCS, the iteration monitor failing. In a system using
checksums to monitor program memory, these tests will fail.
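
The "about half" figure can be illustrated with a short sketch. It assumes,
for illustration only, that the faulted pin is an address line and that all
addresses are equally likely to be used. The function names and the 12-bit
width are hypothetical.

```python
def apply_stuck_pin(address, bit, stuck_high):
    """Return the address actually seen by memory when one address
    line is stuck high or stuck low."""
    mask = 1 << bit
    return address | mask if stuck_high else address & ~mask


def fraction_misaddressed(width, bit, stuck_high):
    """Fraction of all addresses that are altered by the stuck line."""
    total = 1 << width
    wrong = sum(
        1 for address in range(total)
        if apply_stuck_pin(address, bit, stuck_high) != address
    )
    return wrong / total


print(fraction_misaddressed(width=12, bit=5, stuck_high=False))
```

The same sketch shows why the effect of one faulted pin depends on the state
of the non-faulted pins: an address is corrupted only when the intended state
of the stuck line differs from the stuck level.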

The second factor which reduces the effort is that many pairs of 
faults will have the same effect. There are numerous instances of an
output pin on one IC being connected only to one other pin. If either pin
fails open, the effect will be the same. Similarly, a ground fault in
either pin will produce the same effect. 

The third factor which reduces the effort is that there are many
instances in which three pins are connected so that one output pin drives
two input pins on different circuits. An open fault at each of the input
pins can be evaluated first. An open fault at the output pin is then
equivalent to both input pins failing open simultaneously, and in most
cases the effect is the "sum" of the effects of the input pins failing
open; that is, both effects occur. If both input pins are on the same
chip, the effect of both being open is more likely to differ from the sum
of the individual effects. See Figure 18.

The effect of any of the three pins failing shorted to ground is the 
same in either of the two cases of Figure 18.

Another frequently encountered condition involving three pins is two
outputs connected to a single input (Figure 19). In such a case, chips A
and B will have three-state outputs, and one or both outputs should be in
the high-impedance state at all times. An open fault on the output pin of
chip A will then only affect chip C when A has its output enabled.
Similarly, an open fault on the output pin of chip B will only affect chip C
when B has its output enabled. An open fault on the chip C input pin will
usually produce the sum of the effects of open faults on the two output
pins. A ground fault on any of the three pins will have the same effect.

Still referring to Figure 19, if a fault should occur which results in 
both enable pins being in the enable state, there is a possibility of 
damage to the A or B chip. If one output is high and the other low, there 
could be a low impedance path to ground, through the output pins, which 
could burn out the A or B chip. This depends on the technology used in the 
individual chips. Frequently, the effect of the original ground fault can
be judged to be a total processor failure whether or not the secondary
damage occurs.
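
The way these equivalences shrink the list of cases to be analyzed can be
sketched as a simple grouping over a netlist. This is an illustrative sketch
only: the netlist, the chip and pin names, and the grouping rules as coded
are assumptions made for the example, not the procedure used in this study.

```python
# Hypothetical netlist: net name -> list of (chip, pin, direction).
NETS = {
    "N1": [("A", 3, "out"), ("B", 5, "in")],                   # one-to-one
    "N2": [("A", 4, "out"), ("B", 6, "in"), ("C", 1, "in")],   # Figure 18 style
    "N3": [("A", 7, "out"), ("B", 8, "out"), ("C", 2, "in")],  # Figure 19 style
}


def distinct_fault_cases(nets):
    """List open and ground fault cases after grouping equivalent pins."""
    cases = []
    for net, pins in nets.items():
        # A ground fault on any pin grounds the whole net: one case per net.
        cases.append((net, "ground", tuple(pins)))
        if len(pins) == 2:
            # Output wired to a single input: an open at either end
            # has the same effect, so one case covers both pins.
            cases.append((net, "open", tuple(pins)))
        else:
            # With three pins, each open is listed individually. The open
            # on the shared pin can usually be assessed as the combination
            # of the other two, but it is still recorded as its own case.
            for pin in pins:
                cases.append((net, "open", (pin,)))
    return cases


def raw_fault_count(nets):
    """Open and ground faults counted pin by pin, with no grouping."""
    return 2 * sum(len(pins) for pins in nets.values())


print(len(distinct_fault_cases(NETS)), "grouped cases versus",
      raw_fault_count(NETS), "ungrouped")
```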

APPLICATION TO RDFCS


In this study, three modules of the processor (Figure 20) were
considered at pin level (Ref. Vol. II, Section 5.1.1.1):

o The instruction mapper prom, which consists of three prom chips
in parallel

o The microprogram sequencer, which consists of three 2911
sequencer chips in parallel

o The microprocessor module, which consists of 4 chips in parallel.
Each of these chips is a 2901A.


Figure 19. Two Output, One Input Condition (both A and B enabled
simultaneously may damage chip)

Figure 20. Processor Block Diagram


The instruction mapper prom chips are read-only memory chips. The
inputs to the chip are machine-level operation codes and the depth of the
stack maintained in the 2901 microprocessors. These are connected to the
address pins of the mapper. The data stored in the prom is the control
store prom address of the first microcode instruction required to execute
the machine level instruction with the processor stack at a particular
depth. The mapper output pins are only active at the beginning of a
microcode sequence, at which time a chip enable signal is sent to the
mapper from the next address control prom.

The microcode address from the mapper prom is routed to the
microprogram sequencer module. This module generates a sequence of
microcode addresses, beginning with the starting address from the mapper
prom. Some microcode routines involve jumps to a new address rather than
sequential progression only. In such cases, the microprogram sequencer
receives the jump address from the control store proms and resumes
sequential generation of addresses.

The microprocessor module is composed of four 2901A microprocessor
chips. Each chip has a word size of 4 bits, so that the four chips in 
parallel are used to provide the processor 16-bit word size. This requires 
that carry signals be passed between 2901A's during arithmetic operations. 
Other interconnections between 2901A's are used for data shift operations. 

The 2901A's are controlled primarily by control signals from the 
control store proms in conjunction with the outputs from various registers.
Section 5.1.1.1 of Volume II should be consulted for further information on
the functions of these registers and other processor modules. 

The failure mode and effect analysis, summarized in Table 3 (in
Appendix A), considered three types of pin-level faults: open, grounded,
and shorted to supply voltage. In most cases, the effect of a fault can be
assessed by using the chip logic diagrams, a description of chip/module
functions and the schematic diagrams (Volume II, Sections 5.1.1.1 -
5.1.1.5). The schematic diagrams are reproduced in Appendix C.

The effect of certain pin faults cannot be determined by analysis
using just the information mentioned above. In particular, the contents of
specific prom addresses are needed in some cases. In other cases the
machine-level code is needed along with the microcode sequences and
addresses. Alternatively, the faults can be inserted and the effect
observed. This approach was taken in this study and the results are
presented in Section 7. For example, it was known that, upon failure of one
of the processor pins used in data shifts (R0, R3, Q0, Q3 stuck high or low),
there would be an immediate disconnect if certain of the integer words made
up of packed Boolean variables were shifted. It was determinable from the
available information that such shifts might occur, but it was not
determinable that they definitely would occur. Volume II, Tables
5.1.4.3.3.3 and 5.1.4.3.3.4 show examples of such packed words. Similarly,
if certain fixed-point numbers were shifted during computation, the
commands to the servos would be in error and the coil current comparators
would trip. While both left and right shifts are normally used in
multiplication algorithms, it was not determinable that a stuck shift bit
would definitely cause such a trip. When the faults were actually
inserted, the processor stopped immediately ("immediately" as viewed by
the human observers). In this way, fault insertion confirmed the overall
effect, massive processor failure and disengagement of the servos, but the
exact mechanism by which it occurred was not determined.

7. FAULT INSERTION


ROLE IN INTEGRATED APPROACH


Fault insertion is used in the integrated assurance approach for three
purposes as shown in Table 1. These are:

1. Faults are inserted, on a sampling basis, to confirm the fault
effects reflected in the fault tree analysis and fault effects
determined during failure mode and effect analysis. This includes
faults of components (sensors and servos in this study) and faults
of integrated circuits (pin-level faults in the digital processor).

2. Faults are inserted, also on a sampling basis, to confirm fault
detection and annunciation functions implemented in the system.
Many of these are also inserted to confirm effects, so that they
are inserted for two specific purposes.

3. Faults are inserted to determine the effect when the analysis is
intractable or when there is some uncertainty in the analysis
result.

APPLICATION TO RDFCS


The RDFCS simulator at NASA-Ames was used to insert the faults shown
in Table 4 (in Appendix E). The faults were of two general types:
component level faults and integrated circuit pin faults. The component
level faults were inserted using the FCC breakout panels (Figure 21), the
Servo Simulator Panel (Figure 22), and the MDICU. Single-sensor faults are
those numbered 1 through 19 in Table 4.

Faults representing a dead sensor or a broken wire from the sensor to
the FCC were inserted by pulling the appropriate jumper plug at the breakout
panel. Faults representing missing sensor validity discretes were also
inserted in this way, although they can also be inserted via the Discrete
Switch Panel (Figure 23). Sensor hardovers and ramps were inserted using
the MDICU. Servo faults were inserted using the Servo Simulator Panel.

For monitoring the processor detection of sensor faults, the CAPS Test
Adapters (CTA) were used. One of the CTA address windows was set to the
address of the Executive Failure (Status) Word (EFW) in each computer
channel. The EFW is a 16-bit word with each bit representing a discrete
piece of information, and there is one EFW for each sensor type in each
computer channel. The 4 low-order bits (0-3) represent, respectively,
failure of the My A (EFMA), My B (EFMB), Other A (EFOA), and Other B (EFOB)
sensor signals. The other 12 bits have functions as described in Volume
II, Table 5.1.2.4.2, which are not of concern here. The data window of the
CTA shows the status of the EFW as four hexadecimal characters, with the
right-most character representing the bits of interest, 0-3.

The effect of a sensor signal being detected bad by the software sensor
monitor is that certain bits are changed from 0 to 1. With no failures
detected, EFMA, EFMB, EFOA, and EFOB are all 0, which is represented in
hexadecimal notation as 0. (0000 binary = 0 hexadecimal.) When the number
1 sensor of a triple sensor complement is detected to have failed, bit 0
(EFMA) is set to 1 in both channels of FCC No. 1. Bit 1 is also set to 1
so that the comparison monitoring will work properly on the two remaining
sensors. The EFW low order bits will then be 0011, which is 3 in
hexadecimal. The net effect, then, of the number 1 sensor of a triple sensor
set failing is that the value displayed in the CTA window changes from 0000
to 0003. The left-most three hexadecimal digits each remain at 0 since
each of the corresponding binary bits (4-15) of the EFW remains at 0.
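
The reading of the CTA data window described above can be sketched as a small
decoder. This is an illustration only; the function name and the dictionary
are hypothetical, and only the four low-order bit assignments given in the
text are modeled.

```python
# Bit positions of the four low-order EFW bits, as described in the text.
EFW_LOW_ORDER_BITS = {0: "EFMA", 1: "EFMB", 2: "EFOA", 3: "EFOB"}


def decode_efw(cta_display):
    """Return the names of the low-order EFW bits that are set, given the
    four hexadecimal characters shown in the CTA data window."""
    word = int(cta_display, 16)
    return [name for bit, name in sorted(EFW_LOW_ORDER_BITS.items())
            if word & (1 << bit)]


print(decode_efw("0000"))  # no failures detected
print(decode_efw("0003"))  # number 1 sensor of a triple set failed
```

For the 0003 display the decoder reports EFMA and EFMB, matching bits 0 and 1
being set.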

Fault cases 1 through 8 were used to show that the software sensor
monitor subroutine is implemented correctly in the RDFCS by subjecting it
to a number of different faults in the same sensor type. These cases were
also used to show that the results of the sensor monitoring are accounted
for in the implementation of the NO DUAL equation, which is also in
software. Cases 9 through 16 were then used to show that the voter is involved
for various sensor types. Rigorous validation of the system by testing
would require that faults be inserted for all sensor types used in
automatic landing. In this study, performed for illustrative purposes, the
full complement of sensor types was not faulted.

In case 2E, NO DUAL did not annunciate even though the fault was
inserted with the airplane inbound to the ILS beam intercept point. This is
believed to be the result of the inbound leg being flown at an
unrealistically low altitude, so that the airplane did not track the glideslope
beam for 25 seconds before passing through 150 ft altitude. A review of
the NO DUAL annunciation logic (Volume II, Section 5.1.2.3.1*3) shows that
this is the most likely cause, since AP.ONEFAIL was set to true. Low
approaches (1500 ft) were being simulated in the interest of time. Approach
altitude was subsequently raised to 2000 ft.

Faults 17 through 19 were used to confirm the servo monitoring and the 
tie-in of the servo monitor outputs to the NO DUAL and disconnect logic. 
The servo monitors, in particular the coil current comparators, are quite 
important in ensuring that the airplane does not enter the crucial phase 
with a faulty computer or servo. 

Fault cases 43 through 45 were used to confirm that the FCC's will
both disengage upon loss of the second sensor, with the AP.DISC warning
displayed, in accordance with the system description, Volume II, Section
4.3.6.1.

At the integrated circuit pin level, a number of open and ground
faults were inserted to confirm the FMEA results of Section 6. For this
activity, one of the FCC's was removed from the pallet and the card
containing the chip to be faulted was extended for access as shown in
Figure 24. Figure 25 shows the processor Data Path card.

Open pin faults, Cases 20 through 23, were inserted by using multiple 
sockets between the chip and the circuit card, with a jumper wire replacing 
the normal pin-to-socket connection. Each fault was inserted by physically 
pulling the jumper to open the connection. This is a slow procedure, since 
the chip must be removed and the jumper wire rigged on the desired pin. The 
chip and sockets must then be installed and the processors brought back up. 
This means of inserting open pin faults is only marginally satisfactory. 
It would be much easier to do if a stack of 5 or 6 sockets could be used 
between the chip and the circuit card. However, the processor will not 
come up with more than three sockets stacked. The longer electrical paths 
resulting from the use of the extender card apparently come close to
exhausting the available tolerance in the timing of the individual
microsteps, and the extra path length and capacitance caused by more than three
sockets disables the processor.

Grounded pin faults are much easier to insert, since the chip does not
have to be removed to set up each case. The processor does have to be
brought back up each time, but this is a fairly rapid step. Before each
fault was inserted, the data sheets from the chip manufacturer were
reviewed, along with the card schematics, to determine that the fault would
not damage any chips. No chips were damaged by the ground faults. The
ground pin faults are cases 24 through 42 in Table 4.

The chip pin faults all disabled the processor, with the exception of 
open pin fault 21. This fault involves a pin of a quad 2-input NOR gate. 
The fault had no effect on the processor operation. 

FAULT INSERTION RESULTS


The faults inserted in the RDFCS simulator achieved the desired
results in the assurance assessment of this study and, more importantly,
confirmed that fault insertion is capable of providing the results required
of it in the integrated assurance approach. Specifically, the faults
inserted confirmed (1) that the NO DUAL warning appears when it should, (2)
that all sensor types faulted and required for automatic landing are 
monitored, (3) that the servo monitoring functions correctly, (4) that the 
effect of pin-level faults in the processor is in agreement with the 
failure mode and effect analysis, and (5) that fault insertion is a 
reasonable way of resolving uncertainty of the effect of open and grounded 
pin faults in digital hardware. While these results were obtained on a 
particular system, the approach is judged to be viable for validating other
digital systems. 

8. FAILURE RATE DEVELOPMENT


The failure rates for servos, sensors, and indicators were taken from
the data base maintained by the Lockheed-Georgia Company Reliability
Engineering Department. They are composite values for representative
components of comparable complexity and construction.

The failure rates for the integrated circuits of the Data Path and 
Control Cards were estimated using the formulas and tables of Military 
Handbook 217C (Ref. 8). The formulas provide a means of accounting for a 
significant number of factors: 

1. Device technology 

2. Device complexity 

3. Junction temperature 

4. Package technology 

5. Application environment (voltage)

6. Usage environment 

7. Quality level 

For example, the equation for the failure rate of a monolithic bipolar
device is:

f = K_Q [C_1 K_T K_V + (C_2 + C_3) K_E] K_L

where:

f is the device failure rate
K_Q is the quality factor
K_T is the temperature adjustment factor for junctions
K_V is the voltage derating stress factor
K_E is the application environment factor
C_1 and C_2 are complexity factors based on transistor count
C_3 is a complexity factor based on package technology and number of
pins
K_L is a learning factor.


The quality factor, K_Q, has a value of 1 for devices procured in full
accordance with MIL-M-38510 (Ref. 9), Class B requirements. This value was
used for all circuits in this project. It should be noted that the quality
factor is a direct multiplier, so that the predicted rate is proportional
to it. More or less stringent quality factors can therefore greatly
influence the prediction for any individual circuit, circuit board, or an
entire component.

Junction temperatures are used in determining the adjustment factors K_T.
The junction temperature is ambient temperature plus the differential
resulting from power dissipation through the case. An ambient of 60°C
was used, with the power dissipation taken from the circuit specification.

The voltage derating stress factor is 1 for the bipolar circuits used
in the CAPS processor. The application environment factor is 3.5 for the
airborne, inhabited, transport environment of the aircraft underdeck
avionics rack. Failure rates for the circuit cards of the FCC's were
obtained by summing the failure rates for the card and its components.
Table 5 summarizes the failure rate prediction for the A13 control card.
Failure rates for the other cards are shown in Table 6.
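
The bipolar device equation can be sketched as a small calculation. This is
an illustration under stated assumptions, not the handbook procedure: the
function and parameter names are hypothetical, the thermal resistance
parameter is an assumed way of expressing the temperature differential, and
the values of C_1, C_2, C_3, K_T and K_L are made up. Only K_Q = 1, K_V = 1,
K_E = 3.5 and the 60°C ambient are taken from the text.

```python
def junction_temperature(ambient_c, power_w, theta_c_per_w):
    """Ambient temperature plus the rise caused by power dissipation.
    theta_c_per_w (thermal resistance) is a hypothetical parameter."""
    return ambient_c + power_w * theta_c_per_w


def bipolar_failure_rate(k_q, c1, k_t, k_v, c2, c3, k_e, k_l):
    """f = K_Q [C_1 K_T K_V + (C_2 + C_3) K_E] K_L,
    in failures per million hours."""
    return k_q * (c1 * k_t * k_v + (c2 + c3) * k_e) * k_l


# K_Q = 1, K_V = 1 and K_E = 3.5 are the values stated in the text.
# C_1, C_2, C_3, K_T and K_L below are made-up illustrative values.
rate = bipolar_failure_rate(k_q=1.0, c1=0.01, k_t=2.0, k_v=1.0,
                            c2=0.002, c3=0.008, k_e=3.5, k_l=1.0)
print(round(rate, 4))

# The quality factor is a direct multiplier: doubling it doubles the rate.
doubled = bipolar_failure_rate(k_q=2.0, c1=0.01, k_t=2.0, k_v=1.0,
                               c2=0.002, c3=0.008, k_e=3.5, k_l=1.0)
print(round(doubled / rate, 6))
```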

Table 7 presents failure rates for the system components other than 
the FCC's. 

In using these rates in the fault tree and CARSRA analyses, an
adjustment was frequently required to include only a portion of the rate,
since only certain failure modes are of interest. For example, each dual
current comparator has a predicted failure rate of 0.03. Each half of the
comparator is given a rate of .01 for the failure mode of failing to trip
when the threshold difference is exceeded. This is a very conservative
rate for this mode.


TABLE 5. FCC CONTROL CARD FAILURE RATE


ITEM                      FAILURE RATE*

Integrated circuits       1.788
Resistors                  .0018
Capacitors                 .214
Oscillator                 .25
Coil                       .0007
Circuit Board              .023
Edge Connector             .16

Control Card              2.45

*All failure rates in failures per million hours.

TABLE 6. PREDICTED FCC CARD FAILURE RATES


CARD NO.                                   FAILURE RATE*

A1         Power Supply Monitor            0.555
A2 - A5    Prom Card                       .809 each
A6         Power Supply Monitor            .55
A7 - A10   Prom Card                       .809 each
A11        Terminator/Test Access          .555
A12        RAM Memory Control              1.18
A13        CAPS Control                    2.45
A14        CAPS Data Path                  1.98
A16        Cross-channel Receiver          .70
A17        DITS Transmitter                1.75
A18        D/A Servo Command               1.75
A19        Terminator/Time Synch           1.40
A20        Discrete Output                 2.79
A21        Data Transmitter/Receiver       .70
A22        Serial Digital Input No. 1      1.65
A23        Serial Digital Input No. 2      1.80
A24        Autoland Sensor Input           1.80
A25        Cruise Sensor Input             U12
A26        Data Acquisition                1.20
A27        Discrete Input                  1.30
A28        Servo Engage Logic              2.61
A29        Cross Channel XMTR              1.20
A30 - A32  Servo Amplifier                 3.00
A33        Speed Servo Amp                 1.70
A300       Speed Command XMTR              1.70
A400       Power Supply                    21.0
A500       Power Supply                    21.0

*All failure rates in failures per million hours.

TABLE 7. FAILURE RATES FOR MAJOR RDFCS COMPONENTS


COMPONENT                              UNIT FAILURE RATE*

Pitch Angle Gyro                       303
Roll Angle Gyro                        303
Yaw Rate Gyro                          200
Accelerometer                          74
Radio Altimeter                        756
ILS Receiver                           252
Air Data System                        167
Roll Autopilot Servo                   14
Pitch Autopilot Servo                  15
Yaw Autopilot Servo                    14
EH Valve Drive Coil                    1.0
LVDT                                   .72
Dual Current Comparator (Hardware)     .03
Warning Annunciator (per function)     8.3

*These are NOT actual failure rates for any particular airplane
or for any single component produced by a particular
manufacturer. They are representative rates determined by
a review of generic component types on a number of airplane
models in a variety of commercial and military applications.
All failure rates per million hours.

9. RELIABILITY PREDICTION USING CARSRA


CARSRA, which stands for Computer-Aided Redundant System Reliability
Analysis (Ref. 10), is an analytical reliability prediction program used in
the integrated assurance approach to obtain the probability of system
failure. In this study, the probability of failure is only considered
during the crucial flight phase, which has a duration of 0.02 hours.

The use of CARSRA, along with the quantitative assessment produced by 
evaluating the fault tree analysis, provides two independent computations
of system failure probability. This reduces the risk of a false, low 
probability of failure being produced by a single method and the error 
remaining undetected. 

Although CARSRA is identified specifically in the integrated assurance 
approach used in this study, some other method (except fault tree analysis) 
could be used. If an alternate method is used, it should have sufficient 
configuration adaptability to produce the predicted probability of system 
failure without requiring simplifying assumptions which would produce a 
false, low prediction. Manual analysis is a feasible alternative to CARSRA 
for many systems. 

CARSRA APPLICATION 


Configuration Description 

Three levels of organization are implicit in the CARSRA inputs, and
these levels must be adhered to by the user. At the top level is the
system, in this case the RDFCS. System failure probabilities constitute
the primary output provided by CARSRA. The intermediate level is comprised
of stages. Each stage consists of one or more identical modules, which are
at the lowest level. In the RDFCS, each sensor is a module, and like
sensors form stages. For example, each of the three normal accelerometers
(NA) is a module, and the three NA together comprise a stage.

Markov Models


Markov models were selected by the CARSRA developers as a major part
of the program's analytical framework. The following discussion of these
models includes some material on applying CARSRA to systems other than the
RDFCS. This material is intended to benefit readers not familiar with the
rationale of developing the input parameters for Markov models as used in
CARSRA.

A Markov model is used to describe the number of failed and operating
modules within each stage. The transition rates from state to state are
used by CARSRA in computing state occupancy probabilities. A separate
Markov model is used for each stage. State 1 is the no-failure state in
each model, and the two states with the highest numbers correspond to stage
failure. The model always starts in State 1. For example, a dual stage
(one of two identical modules required for the stage to function) might
have 4 states, as shown in Figure 26. State 1 represents both modules
working. State 2 represents one module failed and one working, and States 3
and 4 represent both modules failed. The highest numbered state, 4 in this
case, represents undetected stage failure, while State 3 represents
detected failure. Note that State 2 does not distinguish which module has
failed.

State transition rates must be supplied to CARSRA by the user. These
are generally functions of the module failure rates, and possibly other
parameters. Returning to the example of the dual stage used previously,
the Markov state diagram would be as in Figure 26. Transition rate f_12 is
the rate at which transitions occur from State 1 to State 2. That is, if
the system is in State 1, the probability that it will transition to State 2
during a short increment of time dt is f_12 dt. The other transition rates
are similarly defined.

If there is no monitoring or switching required when the first module
fails, and if there is no possibility of the stage failing undetected, the
transition from State 1 will always be to State 2, and the transition from
State 2 will always be to State 3. Transition rate f_12 will be simply 2f
and f_23 will be f, where f is the failure rate of a single module. The
other transition rates will be 0. Note that this means that State 4 will
never be occupied, consistent with undetected stage failure being
impossible.
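
The state occupancy probabilities for this simple dual stage can be sketched
numerically. The code below is a minimal illustration and is not the CARSRA
algorithm: it integrates the state equations with plain Euler steps, and the
function name, step count, and module failure rate are assumptions chosen for
the example. The 0.02 hour duration is the crucial phase duration used in
this study.

```python
import math


def state_occupancy(rates, n_states, hours, steps=10000):
    """Integrate the Markov state equations with simple Euler steps.

    rates maps (from_state, to_state) to a transition rate per hour,
    with states numbered from 1. The model starts in State 1.
    """
    p = [0.0] * (n_states + 1)  # index 0 is unused
    p[1] = 1.0
    dt = hours / steps
    for _ in range(steps):
        delta = [0.0] * (n_states + 1)
        for (src, dst), rate in rates.items():
            flow = rate * p[src] * dt
            delta[src] -= flow
            delta[dst] += flow
        p = [value + change for value, change in zip(p, delta)]
    return p[1:]


f = 100e-6  # hypothetical module failure rate, failures per hour
rates = {(1, 2): 2 * f, (2, 3): f}  # f_12 = 2f, f_23 = f, all others 0
p1, p2, p3, p4 = state_occupancy(rates, n_states=4, hours=0.02)

# For this two-transition chain the detected-failure probability has the
# closed form (1 - e^(-f t))^2, which the numerical result should approach.
closed_form_p3 = math.expm1(-f * 0.02) ** 2
print(p3, closed_form_p3, p4)
```

State 4 stays at exactly zero in this sketch because no transition leads to
it, which mirrors the observation that undetected stage failure is impossible
in this configuration.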

Figure 26. Markov Model of Dual Stage (State 3: detected stage failure;
State 4: undetected stage failure)




In many instances encountered in real systems, digital or otherwise, a
reconfiguration must occur before the redundancy can be used. In the
example dual case, an output monitor could be used on each module. If the
monitor can detect 97% of module failures, e.g. no output or unreasonable
output, the monitor provides "coverage", $c$, of 97%. The transition rate
$f_{12}$ is then $2fc$, so that 97% of the transitions from State 1 go to State 2.

Of the remaining 3% of the transitions from State 1, some fraction,
e.g. 2/3, could go to State 3 and the rest to State 4. This would result
in $f_{13}$ being $2f(1-c)(2/3)$, or $2f(0.02)$, and $f_{14}$ being $2f(1-c)(1/3)$, or
$2f(0.01)$.
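
The split of the State 1 exit rate can be written out as a short calculation. This Python sketch is only an illustration; the function name `dual_stage_rates` and the parameter `detected_fraction` are hypothetical names introduced here, and the rates are expressed in units of $f$ so that the factors match those quoted in the text.

```python
def dual_stage_rates(f, c, detected_fraction):
    """Transition rates out of State 1 for a dual stage with coverage c.

    f: single-module failure rate
    c: monitor coverage (0..1)
    detected_fraction: share of the uncovered failures that lead to detected
        stage failure (State 3); the remainder go to State 4.
    """
    f12 = 2 * f * c
    f13 = 2 * f * (1 - c) * detected_fraction
    f14 = 2 * f * (1 - c) * (1 - detected_fraction)
    return f12, f13, f14

f = 1.0  # work in units of f
f12, f13, f14 = dual_stage_rates(f, c=0.97, detected_fraction=2 / 3)
print(f"f12 = {f12:.2f} f, f13 = {f13:.2f} f, f14 = {f14:.2f} f")
print(f"sum of rates out of State 1 = {f12 + f13 + f14:.2f} f")
```

The three rates always add up to $2f$, whatever values of coverage and detected fraction are chosen, because every failure of either working module must leave State 1.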

Note the distinctions between coverage, which relates to module fail- 
ure detection, and undetected stage failure. Note also that the function 
of a particular stage could be such that it cannot fail undetected, even 
though individual modules within the stage may fail with coverage less than 
1. In other cases, stage failure may be detected only by multiple module 
failures being detected. 

It should also be noted that the sum of transition rates out of State
1 is 2f. In general, if any state corresponds to N modules working, the 
sum of transition rates out of that state will be Nf. 

It should be noted also that stages can fail for two reasons, spares 
exhaustion or coverage failure. In contemporary aircraft systems having 
critical functions to perform, coverage failures are of as much concern as 
spares exhaustion. 

In the previous dual stage example with 97% coverage of the first
module failure, no consideration was included of the failure rate of the
monitor itself. The coverage factor of 97% means that 97% of the module
faults are of such a nature that they can be detected by an unfailed
monitor. The rest are outside of the monitor's capability. In cases where
dedicated hardware monitors are used, it is appropriate to consider their
failure rates and failure modes. A two-state monitor is the type most
frequently encountered. It provides only a GOOD/BAD signal. Such a
monitor has only two failure states: false indication of BAD when the
module is good, and false indication of GOOD when the module is bad.

The simplest way of treating such monitors in CARSRA is to combine the
monitors with the modules as a single stage. The transition rate from
State 1 to State 2 is then

$$f_{12} = 2fc\,r_m + 2f_m a,$$

where $f$ and $c$ are as before, $r_m$ is the reliability of the monitor over
the entire flight time, $f_m$ is the monitor failure rate, and $a$ is the
fraction of monitor failures resulting in a good module being declared bad.
The other transition rates would be similarly defined, recognizing the
relation between detection of stage failure and component monitors. Each
instance of such a stage must be evaluated individually in determining the
applicable rate formulas.

Frequently, certain terms in a rate equation can be ignored because
they are numerically negligible. For example, if $f = 120 \times 10^{-6}$ and
$f_m = 0.1 \times 10^{-6}$, the term $2f_m a$ can be ignored in the formula

$$f_{12} = 2fc\,r_m + 2f_m a$$

provided $c$ is not absurdly small. If $c$ is 90%, $a$ is 50%, and the flight
time is 10 hours,

$$f_{12} = 2(120 \times 10^{-6})(0.90)\exp(-0.1 \times 10^{-6} \times 10) + 2(0.1 \times 10^{-6})(0.50) \approx 216 \times 10^{-6} + 0.1 \times 10^{-6}.$$

Inclusion of the term yields a rate of $216.1 \times 10^{-6}$; ignoring it yields
$216 \times 10^{-6}$. The difference is much less than that caused by uncertainty
in the module failure rate, $120 \times 10^{-6}$.
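
The size of the two terms can be confirmed with a few lines of Python. This is a sketch under the assumptions stated in the text (monitor reliability taken as an exponential over the flight time); the function name `f12_with_monitor` is hypothetical.

```python
import math

def f12_with_monitor(f, c, f_m, a, flight_hours):
    """Return the two terms of f12 = 2*f*c*r_m + 2*f_m*a separately.

    r_m = exp(-f_m * flight_hours) is the monitor reliability over the flight.
    """
    r_m = math.exp(-f_m * flight_hours)
    covered_term = 2 * f * c * r_m        # module failure detected by a good monitor
    false_alarm_term = 2 * f_m * a        # monitor failure declaring a good module bad
    return covered_term, false_alarm_term

covered, false_alarm = f12_with_monitor(
    f=120e-6, c=0.90, f_m=0.1e-6, a=0.50, flight_hours=10.0)

print(f"covered term     = {covered * 1e6:.4f} x 10^-6")
print(f"false-alarm term = {false_alarm * 1e6:.4f} x 10^-6")
print(f"total f12        = {(covered + false_alarm) * 1e6:.4f} x 10^-6")
```

The false-alarm term is less than one part in two thousand of the total, so dropping it changes the rate far less than a plausible error in the module failure rate would.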

Dependencies 

CARSRA permits the user to describe instances in which failures of a
module in one stage will prevent a module in another stage from being used.
An example of this in the RDFCS is the portion of each FCC channel which
receives sensor data and makes it available to the other channels. Data
Acquisition Card A26 in FCC No. 1 receives data from the No. 1 unit of each
triple sensor type, and relays it to another card for transmission to the
other three channels and for use by its own channel (Ref. Vol. II,
Section 5.1.1.3.1.5). There are 5 triple-sensor types involved in the
autoland mode: pitch, roll, and yaw rate gyros; and lateral and normal
accelerometers. (The A26 card also handles data from other sensors, but
only these five will be used for discussion here.) If the A26 card fails
in FCC No. 1, the data will be lost from pitch gyro No. 1, roll gyro No. 1,
yaw rate gyro No. 1, lateral accelerometer No. 1, and normal accelerometer
No. 1, just as if all 5 of these sensors had failed. The A26 card is
called a dependency module, and its stage a dependency stage. Each of the
affected sensors is called a non-dependency module, and the corresponding
stage a non-dependency stage.
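
The dependency relation described above amounts to a mapping from a dependency module to the non-dependency modules it disables. The Python sketch below illustrates that idea only; the dictionary layout, the module labels, and the function name `unusable_modules` are hypothetical and do not represent the CARSRA input format.

```python
# Dependency module -> non-dependency modules lost when it fails.
# Only the A26 card of FCC No. 1 and the five sensors named in the text are listed.
DEPENDENCIES = {
    "A26 (FCC No. 1)": [
        "pitch gyro No. 1",
        "roll gyro No. 1",
        "yaw rate gyro No. 1",
        "lateral accelerometer No. 1",
        "normal accelerometer No. 1",
    ],
}

def unusable_modules(failed):
    """Return every module that cannot be used, given the set of failed modules."""
    lost = set(failed)
    for dependency, dependents in DEPENDENCIES.items():
        if dependency in failed:
            lost.update(dependents)   # treated as if the sensors themselves had failed
    return lost

after_a26_failure = unusable_modules({"A26 (FCC No. 1)"})
after_gyro_failure = unusable_modules({"roll gyro No. 1"})
print(sorted(after_a26_failure))
```

A failure of the A26 card removes six modules from use (the card and five sensors), whereas a failure of one sensor removes only that sensor.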

Coverage for sensor failures is provided by comparison monitoring and
reconfiguration (Vol. II, Sec. 5.1.2.4). Each channel independently
performs the sensor monitoring functions on the data it will use in control
law computations. When a channel detects a failed sensor, it does not
transmit the identity of the individual sensor to the other channels. When
a B channel detects a failure, it does transmit a discrete variable,
AP.ONEFAIL, to the A channel in the same FCC. The A channel will turn on
the NO DUAL annunciation based on its receipt of AP.ONEFAIL from B, or its
own detection of a sensor failure. The NO DUAL indication is provided to
inform the crew that the RDFCS is not fail-operational. The No. 1 FCC
drives the No. 1 Warning Annunciator Indicator (WAI) and the No. 2 FCC
drives the No. 2 WAI, so that warning will be provided if either channel of
either FCC detects the failure.

The sensor monitoring is part of the foreground flight software.
Consequently, for a channel to detect a fault, the CAPS processor must
function, as must the CAPS bus and portions of the program and data memory.
These are the same hardware elements which perform other functions, such as
control law computations and mode logic computation. Most faults in these
circuits will result in a totally debilitated processor, so that the
inability to monitor the sensors is inconsequential. Note also that even if
one channel does lose the ability to monitor sensors, any one of the other
three channels can force the NO DUAL warning.

In light of the foregoing, the only appreciable probability that the
loss of fail-operational sensor capability will not be annunciated results
from loss of both WAI. The multiple-function WAI (Ref. Vol. II, Section
5.16.1) has a unit failure rate prediction of 33 per million hours. The
failure rate of any one of the 8 warning messages is conservatively taken
to be one-fourth the unit rate, or 8.3 per million hours. It may be noted
from Vol. II, Table 5.1.4.6 that the FCC activates the NO DUAL message by
providing a ground to the WAI, so that a broken wire or bad connector contact
would prevent annunciation. A rate of 1.3 per million hours is included
for such failures. Also, the Discrete Output (A20) and Servo Engage Logic
(A28) cards are involved, with failure rates of 2.79 and 2.61 per million
hours, respectively. Even though only a portion of the failures of these
cards will affect NO DUAL, the full rate is used. Further analysis could
reduce this rate substantially. The failure rate for NO DUAL is then


    WAI            8.3  x 10^-6
    Wiring         1.3
    A20 Card       2.79
    A28 Card       2.61
                  -------------
    Total         15.0  x 10^-6


The probability of failure in a 4-hour time period is then $60 \times 10^{-6}$. The
probability of both NO DUAL warnings being lost is the square of this
number, $3.6 \times 10^{-9}$. It may be noted from Vol. II, Sec. 5.1.2.3.1.1.3 that
the test button on the WAI results in the FCC circuitry and the wiring
being tested as well as the WAI itself. Thus latent failures are not a
problem, provided the indicators are tested prior to autoland.
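
The arithmetic behind the NO DUAL figures can be reproduced as follows. This Python sketch simply restates the numbers quoted in the text; the variable names are hypothetical, and the probability over 4 hours uses the small-probability approximation (rate times exposure time) that the quoted value of 60 x 10^-6 implies.

```python
# Failure-rate contributions to loss of one NO DUAL annunciation, per hour.
contributions = {
    "WAI (one message)": 8.3e-6,
    "Wiring": 1.3e-6,
    "A20 card": 2.79e-6,
    "A28 card": 2.61e-6,
}
exposure_hours = 4.0

rate = sum(contributions.values())   # total failure rate for one NO DUAL path
p_one = rate * exposure_hours        # rate * t approximation for a small probability
p_both = p_one ** 2                  # both annunciations lost, treated as independent

print(f"rate   = {rate * 1e6:.2f} x 10^-6 per hour")
print(f"p_one  = {p_one * 1e6:.1f} x 10^-6")
print(f"p_both = {p_both * 1e9:.2f} x 10^-9")
```

Squaring the single-path probability treats the two annunciation paths as independent, which is the assumption implicit in the text.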

The factor $3.6 \times 10^{-9}$ is used as the probability that the first
failure of a sensor type will not be covered. This does not constitute
stage failure, either detected or undetected. Undetected stage failure is
assumed to occur on second failure, provided the first failure was
undetected. This is somewhat a misuse of the term "undetected"; the stage
failure itself is not necessarily undetected, but the increased likelihood
of its occurrence, following first failure, is not annunciated.

This treatment of sensor failures allows the availability feature of 
CARSRA to be used in computing the probability of loss of one sensor prior 
to 150 ft., failure of the NO DUAL annunciation, and another failure below 
150 ft. The availability feature is discussed in the next section.
Availability


CARSRA permits system reliability to be computed for a mission phase
which follows a period of operation with less stringent failure criteria.
An obvious example of this is the RDFCS, which is fail-passive in cruise,
but must be fail-operational in autoland below 150 ft. The availability
feature allows the user to specify which modules may be failed at the
beginning of autoland without forcing diversion to an alternate landing
site. Each such availability configuration must provide adequate
reliability for the landing, although not as much as if everything is
working. The RDFCS requires all of the modules used in autoland to be
operational, so that the availability feature might seem not needed in this
assessment. It is needed, though, to compensate for a capability which
CARSRA lacks.

The reliability of the RDFCS for automatic landing is predicated on 
the system being fail-operational as the alert height is passed. There- 
fore, the probability of the system having a latent failure at 150 ft. and 
a second failure below that point must be quite small. 

By setting up the CARSRA input to allow one sensor of each type to
fail during cruise, with the transition rate from State 2 to the undetected
failure state including the coverage factor of $3.6 \times 10^{-9}$, the undetected
system failure probability computed by CARSRA will give the probability of
an undetected latent failure at 150 ft. and a second failure before
touchdown. (See Figure 27.)

What CARSRA will actually compute is: 

    P(0 failures at 4 hours)
        x P(undetected failure and detected failure between 4 and 4.02 hrs.)
  + P(1 undetected failure at 4 hours)
        x P(second failure between 4 and 4.02 hrs.)

Since the probability of both an undetected and a detected failure
between 4 and 4.02 hours is very small, the first term is negligible and
the output will be equal to the second term, which is the probability
desired. This approach is used for the undetected (unannunciated) failures
throughout the system. The definition of stages and the transition rates
are shown in Figure 28.

    Transition rate    Dual sensor    Triple sensor
    f12                2f             3f
    f13                0              0
    f14                0              0
    f23                f              2f
    f24                fa             2fa

    f = module failure rate
    a = annunciation factor, 3.6 x 10^-9

Figure 27. Markov Model Coding for Sensor Stages
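
Using the triple-sensor rates of Figure 27, the second term can be approximated by hand. The Python sketch below is only an illustration of the structure of that term: the sensor failure rate used is a hypothetical value, not one taken from this report, and the function name is likewise an assumption. The 4-hour and 0.02-hour intervals and the annunciation factor are those quoted in the text.

```python
import math

def p_unannunciated_then_second(f, a, n, t_before, t_phase):
    """Approximate P(1 unannunciated failure at t_before) x P(second failure in phase).

    f: sensor failure rate per hour (hypothetical value supplied by the caller)
    a: probability that the first failure is not annunciated
    n: number of sensors of the type (3 for a triple-sensor stage)
    """
    q = math.exp(-f * t_before)                      # one sensor survives to t_before
    p_one_failed = n * (1.0 - q) * q ** (n - 1)      # exactly one of n has failed
    p_second = -math.expm1(-(n - 1) * f * t_phase)   # either survivor fails in the phase
    return p_one_failed * a * p_second

f = 10e-6     # hypothetical sensor failure rate per hour, not from the report
a = 3.6e-9    # annunciation factor from the text
p = p_unannunciated_then_second(f, a, n=3, t_before=4.0, t_phase=0.02)

# First-order check with the Figure 27 rates: (3f * 4 h) * a * (2f * 0.02 h).
first_order = (3 * f * 4.0) * a * (2 * f * 0.02)
print(f"p = {p:.3e}, first-order estimate = {first_order:.3e}")
```

`math.expm1` is used for the short-interval failure probability because the direct form `1 - exp(-x)` loses precision when `x` is very small, which is the same kind of numerical difficulty discussed later in this section.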

The CARSRA program computed some negative probabilities for the
unannunciated failures. It is suspected that this may have been caused by
the program being run on a Univac 1100-series computer, which has a 36-bit
word length. The transition rates to the unannunciated failure states are 
quite small in some cases (1 x lO -1 ^), and addition and subtraction of 
numbers of this magnitude with numbers close to 1.0 could produce some
numerical accuracy problems on a 36-bit machine. At NASA-Ames, the program
is run on a CDC computer, which has a much larger word size, 64 bits, so
that the problem is thought to be unlikely there. Time was not available
during the study to investigate and resolve the problem, but this will be
done when possible.
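
The suspected mechanism can be illustrated by emulating a short significand in Python. This is a sketch only: the 27-bit significand width and the small probability value are assumptions chosen for illustration, and the arithmetic actually used by the program on the 36-bit machine is not documented in this report. The helper `round_to_bits` is a hypothetical function that rounds a value to a given number of significand bits.

```python
import math

def round_to_bits(x, bits):
    """Round x to a binary significand of the given number of bits."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)               # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** bits
    return math.ldexp(round(m * scale) / scale, e)

BITS = 27         # assumed short-word significand width (illustrative)
small = 1e-10     # hypothetical small transition probability

# Probability of remaining in a state, stored in the short format.
stay_short = round_to_bits(1.0 - small, BITS)
# Recovering the small probability by subtraction from 1.0.
leave_short = round_to_bits(1.0 - stay_short, BITS)

# The same two steps in ordinary double precision.
stay_double = 1.0 - small
leave_double = 1.0 - stay_double

print(f"short format : stay = {stay_short!r}, leave = {leave_short!r}")
print(f"double       : stay = {stay_double!r}, leave = {leave_double!r}")
```

In the short format the small probability is absorbed completely when combined with a number near 1.0. Quantities obtained as differences of such rounded values are then dominated by rounding error of either sign, which is one way a computed probability could come out slightly negative.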

Because of the numerical problem encountered with the CARSRA output,
the system failure probabilities reported herein were actually manually
calculated. This was done by manually computing the stage occupancy
probabilities, and then combining these probabilities to account for
dependencies between stages, using the same logic that the CARSRA program
uses.

The probability of an undetected failure prior to the crucial phase,
followed by a second failure in the crucial phase, is $3.36 \times 10^{-14}$,
compared to $2.46 \times 10^{-14}$ from the fault trees. The probability of multiple
failures in the crucial phase, if everything is working just prior to the
phase, is $0.658 \times 10^{-9}$, compared with $0.638 \times 10^{-9}$ from the fault trees.
10. CONCLUSIONS

The conclusions resulting from this study relate to the benefits and
limitations of the integrated assurance approach used and the RDFCS
simulator. Certain of the conclusions lead to recommendations, as discussed
subsequently.

The primary conclusion drawn from this study is that the integrated
assurance approach used is workable for a system, such as the RDFCS, which
employs monitoring totally separate from the hardware/software being
monitored. In the RDFCS, this monitoring includes the servo coil current
comparators and the modulator piston follow-up monitoring. It also
includes the warning annunciations which one FCC can generate following a
failure in the other FCC. A single-string, self-monitored system might be
much less amenable to this approach, depending on the monitoring approaches
used. This possibility is outside the scope of this study.

Fault tree analysis is a feasible analytical method for system level
faults. One benefit is that specific software failures are identified as 
the analysis progresses. These can be, and should be, used as a check on 
the validation test case selection to assure that the software function is 
rigorously tested. Fault trees can be extended to the circuit card level 
in a well organized computer such as used in the RDFCS. In general, the 
analysis is facilitated by a design with clearly partitioned and
identifiable functions and interface structure which is consistent for all card
inputs and outputs. 

Failure mode and effect analysis is more easily accomplished than 
fault trees within the processor itself. This is because of the processor 
being involved in a diverse set of functions defined by the flight 
software. Most individual pin-level faults have many effects. Usually, 
each fault can be traced to an effect which totally debilitates the
processor. Other effects which would also cause massive processor failure,
or erroneous results only under certain conditions, do not have to be
analysed in detail, provided their effects will not propagate across
channels. In contrast, a fault tree analysis based on loss of required
system functions would result in identification of the same hardware faults
time after time.
The FMEA and fault insertion sessions should be on an iterative basis.
After beginning the FMEA, a fault insertion session should be used to
confirm the analysis to that point. The results should then be
incorporated in the FMEA and the entire FMEA reviewed in light of those results.
This review may lead to identification of additional fault cases which 
should be simulated to resolve uncertainty which may have arisen. This 
iterative approach was not feasible in this study because of limitations on 
the availability of the simulator, which was being used on other projects. 

The RDFCS simulator has substantial capability for research investiga- 
tions of digital flight control system validation issues. This capability 
would be significantly improved by an automated fault insertion and data 
recording capability. Such a capability should be preprogrammable with a 
list of faults to be inserted. It should include means of recording the
impact of each fault (e.g., changes in the values of discrete variables) 
for many more variables than the A accessible through the CTA's. It should 
allow variables in channels other than the faulted one to be accessed and 
recorded. 

CARSRA, in its present form, should be used with caution when small 
failure rates are involved and when execution is to be on a computer with a 
shorter word length than the 64 bits used in Control Data computers. The 
possibility of erroneous system failure probability values being output 
exists under such conditions. This needs to be explored further. 

Fault tree analysis and CARSRA provide comparable results for rela- 
tively straightforward redundancy conditions, such as the probability of 
multiple failures during the crucial phase when all components are working 
at the beginning of the phase. For more complicated situations, the two 
methods do not agree as closely. This is a result of different
simplifications and assumptions being made to structure the problem to the two
methods. For example, the third sensor of a triple sensor set (Figure 1)
has redundant input paths to the computers (the data input sections of the
two computer B channels) but the other sensors have only a single data path
(the A channel input sections). This is treated correctly in the fault
trees, but the redundancy cannot be accounted for in CARSRA. The
conservative assumption is therefore made that loss of either B channel sensor
input capability will cause loss of the third sensor in all triple sensor
sets. In validation work, any assumptions required can be made
conservatively so that the computed failure probability is actually an upper bound
on the true probability.
APPENDIX A. FMEA RESULTS

APPENDIX B. FAULT SIMULATION RESULTS

APPENDIX C. PROCESSOR SCHEMATIC DIAGRAMS

FIGURE C-1. CONTROL CARD SCHEMATIC DIAGRAM (SHEET 3 of 3)

FIGURE C-2. DATA PATH CARD SCHEMATIC DIAGRAM (SHEET 1 of 3)

















